🕷️ Job Radar • SCRAPING

Live freelance tracking. Raw descriptions turned into structured data. Find your next tech project without the noise.

upwork.com 🟢 2026-05-16

🔹 [Target] Real Estate Deed Intelligence Pipeline [Method] Refactor & Automate [UI/UX] N/A [Stack] Python, Playwright, DuckDB, Pandas, Dagster [Security] N/A [Format] JSON
👤 Client: 🇺🇸 USA Member since 2025-01-28
💰 Price: ****
🚩 Problem: Refactor and automate an existing real estate deed intelligence pipeline to ensure clean, maintainable daily operations.
📦 Existing: Working or mostly-working Python deed scrapers for Miami-Dade, Broward, and Palm Beach County, Florida; source-specific notes and raw artifacts; commercial-property filtering logic; report/export code; architecture notes; normalization/data-cleaning notes; reviewer support on the client's side.

Specifications:

[Target] Real Estate Deed Intelligence Pipeline
[Method] Refactor & Automate
[UI/UX] N/A
[Stack] Python, Playwright, DuckDB, Pandas, Dagster
[Security] N/A
[Format] JSON

Workflow:

1. Inspect the existing project handoff.
2. Confirm the existing Miami-Dade, Broward, and Palm Beach deed scrapers run.
3. Package/wrap the existing scrapers into one clean Python pipeline.
4. Preserve raw source artifacts and all captured fields.
5. Store raw and normalized data in DuckDB or SQLite.
6. Build a central dirty-data normalization layer so cleanup logic is not scattered across random scripts.
7. Normalize deed/property/entity fields such as: seller/grantor names, buyer/grantee names, LLC/entity names, trust names, individual names, addresses, APN/parcel/folio values, document types, sale dates, sale prices/consideration values, property-use or asset-class codes when available.
8. Keep county/source separation clean so Miami-Dade, Broward, Palm Beach, and future counties do not become a messy combined blob.
9. Calculate the 45-day identification-window fields for 1031 exchanges.
10. Keep or improve the existing commercial-property filtering logic.
11. Add dedupe and basic entity/property matching starter logic.
12. Add ambiguity/manual-review output for uncertain matches.
13. Generate CSV export.
14. Generate XLSX export.
15. Generate HTML broker-facing report.
16. Generate JSON feed for an existing website/private room.
17. Add validators so bad, stale, duplicate, empty, or malformed data does not silently publish.
18. Add logging, run history, and clear error handling.
19. Wire the pipeline into Dagster if practical.
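
As a rough illustration of steps 7, 9, and 17, here is a minimal Python sketch of a central name normalizer, a 45-day identification-window calculation, and a pre-publish validator. Everything named here is hypothetical and not taken from the client's code: the function names, the `IdentificationWindow` dataclass, and the record keys `grantor`, `grantee`, and `sale_date`. It also assumes that the recorded sale date stands in for the transfer date and that the window is counted in calendar days; the real pipeline may need different rules.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: the window is 45 calendar days counted from the sale date.
IDENTIFICATION_WINDOW_DAYS = 45


def normalize_entity_name(raw: str) -> str:
    """Uppercase, drop periods and commas, and collapse whitespace."""
    cleaned = re.sub(r"[.,]", "", raw.upper())
    return re.sub(r"\s+", " ", cleaned).strip()


@dataclass(frozen=True)
class IdentificationWindow:
    deadline: date        # last day of the identification window
    days_remaining: int   # negative once the window has closed
    is_open: bool


def identification_window(sale_date: date, today: date) -> IdentificationWindow:
    """Derive the window fields for one deed record."""
    deadline = sale_date + timedelta(days=IDENTIFICATION_WINDOW_DAYS)
    return IdentificationWindow(
        deadline=deadline,
        days_remaining=(deadline - today).days,
        is_open=sale_date <= today <= deadline,
    )


def validate_deed(record: dict, today: date) -> list[str]:
    """Return a list of problems; an empty list means the record may publish."""
    problems = []
    if not record.get("grantor"):
        problems.append("missing grantor")
    if not record.get("grantee"):
        problems.append("missing grantee")
    sale_date = record.get("sale_date")
    if sale_date is None:
        problems.append("missing sale_date")
    elif sale_date > today:
        problems.append("sale_date is in the future")
    return problems


if __name__ == "__main__":
    window = identification_window(date(2026, 5, 16), today=date(2026, 6, 1))
    print(window.deadline, window.days_remaining, window.is_open)
    # prints: 2026-06-30 29 True
```

Keeping these as small pure functions in one module is one way to meet the "central normalization layer" goal in step 6: each county scraper can call the same helpers instead of carrying its own cleanup logic.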
